Strategy changes are an essential part of evolutionary games. Here we introduce a simple rule that, depending
on the value of a single parameter w, influences the selection of players that are considered as potential sources 
of the new strategy. For positive w players with high payoffs will be considered more likely, while for negative 
w the opposite holds. Setting w equal to zero returns the frequently adopted random selection of the opponent. 
We find that increasing the probability of adopting the strategy from the fittest player within reach, i.e. setting 
w positive, promotes the evolution of cooperation. The robustness of this observation is tested against different 
levels of uncertainty in the strategy adoption process and for different interaction networks. Since the evolution
to widespread defection is tightly associated with cooperators having a lower fitness than defectors, the fact 
that positive values of w facilitate cooperation is quite surprising. We show that the results can be explained 
by means of a negative feedback effect that increases the vulnerability of defectors although initially increasing 
their survivability. Moreover, we demonstrate that the introduction of w effectively alters the interaction network 
and thus also the impact of uncertainty by strategy adoptions on the evolution of cooperation. 



PACS numbers: 02.50.Le, 87.23.-n, 89.65.-s 



I. INTRODUCTION 

Cooperation within groups of selfish individuals is ubiq- 
uitous in human and animal societies. To explain and un- 
derstand the origin of this phenomenon, evolutionary games, 
providing a suitable theoretical framework, have been studied
extensively by many researchers from various disciplines over
the past decades [1-3]. The evolutionary prisoner's dilemma
game in particular, illustrating the social conflict between
cooperative and selfish behavior, has attracted considerable
attention both in theoretical as well as experimental studies [4].
In a typical prisoner's dilemma [5], two players simultaneously
decide whether they wish to cooperate or defect. They
will receive the reward R if both cooperate, and the punish- 
ment P if both defect. However, if one player defects while the 
other decides to cooperate, the former will get the temptation 
T while the latter will get the sucker's payoff S. The ranking 
of these four payoffs is T > R > P > S, from which it is clear that
players need to defect if they wish to maximize their own payoff,
irrespective of the opponent's decision. The result is a social
dilemma, which typically leads to widespread defection.
To overcome this unfortunate outcome, several mechanisms
that support the evolution of cooperation have been identified
(see [6] for a review).

Of particular renown are the investigations of spatial prisoner's
dilemma games, which have proven very inspirational over the
decades. In the first spatial prisoner's dilemma game, introduced
by Nowak and May [7], players were located on a square lattice,
and their payoffs were gathered from the games with their
neighbors. Subsequently, players were allowed to adopt the
strategy of their neighbors, provided their fitness was higher.
It was shown that the introduction of spatial structure enables
cooperators to form clusters, thereby promoting the evolution of
cooperation. Along this pioneering line of research, many
different mechanisms aimed at sustaining cooperation were
subsequently proposed and investigated. Examples include the
reward mechanism [8, 9], simultaneous adoption of different
strategies depending on the opponents [10], preferential selection
of a neighbor [11-15], the mobility of players [16-24],
heterogeneous teaching activity [25, 26], differences in
evolutionary time scales [27, 28], neutral evolution [29], and
coevolutionary selection of dynamical rules [30, 31], to name but
a few. Looking at some examples
more specifically, in a recent research paper [32], where players
were allowed to either adjust their strategy or switch their
defective partners, an optimal state that maximizes cooperation
was reported. In [18, 19] it was shown that the mobility
of players can lead to an outbreak of cooperation, even if the
conditions are noisy and do not necessarily favor the spreading
of cooperators. Inspired by these successful research efforts,
an interesting question poses itself, which we aim to address
in what follows. Namely, if we consider a simple addition to 
the prisoner's dilemma game that allows players to aspire to 
the fittest, i.e. introducing the propensity of designating the 
most successful neighbor as being the role model, is this ben- 
eficial for the evolution of cooperation or not? The answer 
is not straightforward since, as we have mentioned, defectors 
spread by means of their higher fitness. Thus, the modification
we consider might give them higher chances of replication.
In early pioneering works, Nowak et al. [33, 34]
showed that increasing the probability of copying high-payoff
neighbors asymptotically leads to increased cooperation,
yet this dependence was not monotonic over the whole
parameter range. Here we aim to investigate this further in the
presence of different levels of uncertainty by strategy adop- 
tions and provide an interpretation of reported results. 

Aside from the progress in promoting cooperation de- 
scribed above, another very important development came 
from replacing the square lattice with more complex interaction
topologies (see [35] for a review), possibly reflecting
the actual state in social networks more closely. Recently,
many studies have attested to the fact that complex networks
play a critical role in the maintenance of cooperation for a
wide range of parameters [36-45]. Quite remarkably, in the
early investigations, it was discovered that the scale-free
network can greatly elevate the survivability of cooperators
compared to the classical square lattice [37]. Following this
discovery, many studies have built on it in order to extend
the scope of cooperation on complex networks. For example,
a high value of the clustering coefficient was found beneficial
[46], while payoff normalization was found to impair the
evolution of cooperation [47-49]. Motivated by these studies,
we also examine how aspiring to the fittest in the prisoner's
dilemma game fares on complex networks; in particular,
whether it promotes or hinders the evolution of cooperation.

Here we thus study the prisoner's dilemma game with the 
introduction of a mechanism that allows players to aspire to 
the fittest. Compared with previous works [40, 50], where a
neighbor was chosen uniformly at random from all the neighbors,
the propensity of designating the most successful neighbor
as the role model is the most significant difference. Our
aim is to study how this mechanism affects the evolution of 
cooperation on the square lattice, as well as on the scale-free 
network and the random regular graph, for different levels of 
uncertainty by strategy adoptions. By means of systematic 
computer simulations we demonstrate, in line with what was
already reported by Nowak et al. [33, 34], that this simple
mechanism can actually promote the evolution of cooperation
significantly. We give an interpretation of the observed
phenomena and examine the impact of different levels of uncertainty
by strategy adoptions and the impact of different interaction 
networks on the outcome of the modified prisoner's dilemma. 
In the remainder of this paper we first describe the considered
evolutionary game, then present the main results, and finally
summarize our conclusions.



II. EVOLUTIONARY GAME 

We consider an evolutionary prisoner's dilemma game with 
the temptation to defect T = b (the highest payoff received by 
a defector if playing against a cooperator), reward for mutual 
cooperation R = b - c, the punishment for mutual defection
P = 0, and the sucker's payoff S = -c (the lowest payoff
received by a cooperator if playing against a defector). For
positive b > c we have T > R > P > S, thus strictly
satisfying the prisoner's dilemma payoff ranking. For simplicity,
but without loss of generality, the payoffs can be rescaled
such that R = 1, T = 1 + r, S = -r and P = 0, where
r = c/(b - c) is the cost-to-benefit ratio [w]. Depending on
the interaction network, the strategy adoption rule and other
simulation details (see e.g. JSSllSll 15211), there always exists
a critical cost-to-benefit ratio r = r_c at which cooperators die
out. We will be interested in determining to what extent
aspiring to the fittest, as we are going to introduce in what
follows, affects this critical value under different circumstances.
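
The rescaling described above can be checked with a few lines of Python. This is only a minimal sketch: the function names `cost_to_benefit`, `rescaled_payoffs` and `payoff` are our own illustrative choices and are not part of the original model description.

```python
def cost_to_benefit(b: float, c: float) -> float:
    """Cost-to-benefit ratio r = c / (b - c), defined for b > c > 0."""
    if not b > c > 0:
        raise ValueError("need b > c > 0")
    return c / (b - c)


def rescaled_payoffs(b: float, c: float):
    """Divide T = b, R = b - c, P = 0, S = -c by (b - c); returns (R, T, S, P)."""
    scale = b - c
    return (b - c) / scale, b / scale, -c / scale, 0.0


def payoff(s_x: str, s_y: str, r: float) -> float:
    """Payoff of a player with strategy s_x ('C' or 'D') against s_y."""
    table = {
        ("C", "C"): 1.0,      # reward R
        ("C", "D"): -r,       # sucker's payoff S
        ("D", "C"): 1.0 + r,  # temptation T
        ("D", "D"): 0.0,      # punishment P
    }
    return table[(s_x, s_y)]
```

For instance, b = 3 and c = 1 give r = 0.5, and dividing the original payoffs by b - c indeed yields R = 1, T = 1 + r, S = -r and P = 0, so the ranking T > R > P > S is preserved.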



Throughout this work each player x is initially designated
either as a cooperator (s_x = C) or defector (s_x = D) with equal
probability. As the interaction network, we use either a regular
L × L square lattice, the random regular graph constructed as
described in [53], or the scale-free network with L^2 nodes and
an average degree of four generated via the Barabási-Albert
algorithm [54]. The game is iterated forward in accordance
with the sequential simulation procedure comprising the
following elementary steps. First, player x acquires its payoff p_x
by playing the game with all its neighbors. Next, we evaluate
in the same way the payoffs of all the neighbors of player x
and subsequently select one neighbor y via the probability



$$\Pi_y = \frac{\exp(w p_y)}{\sum_{z} \exp(w p_z)}, \qquad (1)$$



where the sum runs over all the neighbors of player x and w
is the newly introduced selection parameter. Evidently, for
w = 0 the most frequently adopted situation is recovered
where player y is chosen uniformly at random from all the
neighbors of player x. For w > 0, however, Eq. (1) introduces
a preference towards those neighbors of player x that
have a higher payoff p_y. Conversely, for w < 0 players with a
lower payoff are more likely to be selected as potential strategy
donors. Lastly then, player x adopts the strategy s_y from
the selected player y with the probability

$$W(s_y \to s_x) = \frac{1}{1 + \exp[(p_x - p_y)/K]}, \qquad (2)$$

where K denotes the amplitude of noise or its inverse (1/K)
the so-called intensity of selection [50]. Irrespective of the
value of w, one full iteration step involves all players
x = 1, 2, ..., L^2 having a chance to adopt a strategy from one
of their neighbors once. Here the evolutionary prisoner's
dilemma game is thus supplemented by a selection parameter
w, enabling us to tune the preference towards which neighbor
will be considered more likely as a potential strategy donor.
For positive values of w the players are more likely to aspire
to their fittest neighbors, while for negative values of w
the less successful neighbors will more likely act as strategy
donors. This amendment seems reasonable and is easily
justifiable with realistic examples. For example, it is a fact that
people are, in general, much more likely to follow a success- 
ful individual than someone who is struggling to get by. This 
is taken into account by positive values of w. However, under 
certain (admittedly rare) circumstances, it is also possible that 
individuals will be inspired to copy their less successful part- 
ners. Indeed, the most frequently adopted random selection 
of a neighbor, retrieved in our case by w = 0, seems in many 
ways like the least probable alternative. It is also informative 
to note that aspiring to the fittest becomes identical to the
frequently adopted "best takes all" rule if w → ∞ in Eq. (1) and
K → 0 in Eq. (2). This rule was adopted in the seminal work
by Nowak and May [7], as well as subsequently by Huberman
and Glance [55], who showed that under certain circumstances
asynchronous updating is substantially less successful in
ensuring the survivability of cooperators than synchronous
updating. Although in our simulations we never quite reach the




FIG. 1: (color online) Characteristic snapshots of cooperators [red
(dark gray in black-white print)] and defectors [light blue (light gray
in black-white print)] for different values of the selection parameter
w. From top left to bottom right w = -0.2, 0, 0.2, 0.5, 1.0 and 4.0,
respectively. All panels depict results obtained for r = 0.022 and
K = 0.1 on a 100 x 100 square lattice.



"best takes all" limit, and thus a direct comparison is some- 
what circumstantial, it is interesting to note that an additional 
uncertainty in the strategy adoption process via finite values of 
K may alleviate the disadvantage that is due to asynchronous 
updating [55].
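
To make the two stochastic ingredients of the model concrete, the following Python sketch evaluates the neighbor-selection probabilities of Eq. (1) and a strategy adoption probability in the spirit of Eq. (2). We assume here the standard Fermi form 1/(1 + exp[(p_x - p_y)/K]) for the latter; the function names and the overflow guard are our own illustrative choices.

```python
import math


def selection_probabilities(neighbor_payoffs, w):
    """Eq. (1): probability of selecting each neighbor, proportional to exp(w * payoff)."""
    # Subtracting the largest exponent leaves the ratios unchanged
    # but avoids overflow for large |w|.
    shift = max(w * p for p in neighbor_payoffs)
    weights = [math.exp(w * p - shift) for p in neighbor_payoffs]
    total = sum(weights)
    return [x / total for x in weights]


def adoption_probability(p_x, p_y, K):
    """Assumed Fermi form: probability that x adopts the strategy of y."""
    z = (p_x - p_y) / K
    if z > 700.0:  # exp(z) would overflow; the probability is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(z))
```

With w = 0 every neighbor is selected with equal probability, while large positive (negative) w concentrates the choice on the neighbor with the highest (lowest) payoff. Equal payoffs p_x = p_y give an adoption probability of 1/2 for any finite K.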

Results of computer simulations presented below were ob- 
tained on populations comprising 100 x 100 to 400 x 400 
individuals, whereby the fraction of cooperators pc was de- 
termined within 10'^ full iteration steps after sufficiently long 
transients were discarded. Moreover, since the preferential 
selection of neighbors may introduce additional disturbances, 
final results were averaged over up to 40 independent runs for 
each set of parameter values in order to assure suitable accu- 
racy. 
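
The simulation procedure can be sketched in plain Python as follows. This is a simplified illustration rather than the code used to produce the reported results: we assume periodic boundary conditions, pick the focal player x at random (so that each player is updated once per full step only on average), use the standard Fermi adoption function, and choose deliberately small default sizes. All function names are hypothetical.

```python
import math
import random


def neighbors(x, L):
    """Four nearest neighbors of site x on an L x L lattice (periodic boundaries assumed)."""
    i, j = divmod(x, L)
    return [((i - 1) % L) * L + j, ((i + 1) % L) * L + j,
            i * L + (j - 1) % L, i * L + (j + 1) % L]


def site_payoff(x, strategy, L, r):
    """Accumulated payoff of x against all its neighbors (1 = cooperator, 0 = defector)."""
    total = 0.0
    for y in neighbors(x, L):
        if strategy[x] == 1:
            total += 1.0 if strategy[y] == 1 else -r
        else:
            total += 1.0 + r if strategy[y] == 1 else 0.0
    return total


def elementary_step(strategy, L, r, w, K, rng):
    """One elementary step: select a donor via Eq. (1), then adopt via the Fermi rule."""
    x = rng.randrange(L * L)
    p_x = site_payoff(x, strategy, L, r)
    nbrs = neighbors(x, L)
    p_nbrs = [site_payoff(y, strategy, L, r) for y in nbrs]
    shift = max(w * p for p in p_nbrs)  # numerical stability only
    weights = [math.exp(w * p - shift) for p in p_nbrs]
    k = rng.choices(range(len(nbrs)), weights=weights)[0]
    z = (p_x - p_nbrs[k]) / K
    adopt = 0.0 if z > 700.0 else 1.0 / (1.0 + math.exp(z))
    if rng.random() < adopt:
        strategy[x] = strategy[nbrs[k]]


def cooperator_fraction(L, r, w, K, transient, measure, seed=0):
    """Average fraction of cooperators over `measure` full steps after `transient` steps."""
    rng = random.Random(seed)
    strategy = [rng.randint(0, 1) for _ in range(L * L)]
    samples = []
    for step in range(transient + measure):
        for _ in range(L * L):  # one full iteration step
            elementary_step(strategy, L, r, w, K, rng)
        if step >= transient:
            samples.append(sum(strategy) / (L * L))
    return sum(samples) / len(samples)
```

A call such as `cooperator_fraction(L=20, r=0.02, w=1.0, K=0.1, transient=200, measure=100)` runs quickly but is far too small for quantitative conclusions; the system sizes, relaxation times and number of independent runs quoted in the text are needed for reliable estimates.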

FIG. 2: (color online) Top panel: Frequency of cooperators pc in
dependence on the cost-to-benefit ratio r for different values of the
selection parameter w. From left to right w = -0.2, 0, 0.5, 1.0,
2.0 and 4.0, respectively. Note that negative values of w impair the
evolution of cooperation, while w > 0 moves the survivability of
cooperators towards larger values of r. Bottom panel: Critical threshold
values of the cost-to-benefit ratio r = r_c marking the transition to
the pure D phase (extinction of cooperators), in dependence on the
selection parameter w. Note that r_c converges in both the negative
and the positive limit of w. In particular, r_c → 0 for negative and
r_c → 0.35 for positive values of w. Depicted results were obtained
for K = 0.1 (both panels).



III. RESULTS 

We start by visually inspecting characteristic spatial
distributions of cooperators and defectors for different values of the
selection parameter w. Figure 1 features the results obtained
for r = 0.022 and K = 0.1, whereat for w = 0 (upper middle
panel) a small fraction of cooperators can prevail on the
square lattice by means of forming clusters, thereby protecting
themselves against the exploitation by defectors [56]. As
evidenced in the upper leftmost panel, for negative values of
w even this small fraction of cooperators goes extinct, thus
yielding the exclusive dominance of defectors. For
positive values of w (upper right panel), however, the
cooperators start mushrooming, whereby clustering remains their
mechanism of spreading and survivability. Interestingly, large
enough values of w can facilitate the evolution of cooperation
to the point of near-complete cooperator dominance (bottom
right panel), or at least equality with the defectors, as implied
by pc > pd in all lower panels of Fig. 1. These results suggest
that when players aspire to adopt the strategy from their fittest 



neighbor the evolution of cooperation thrives. In what follows 
we will systematically examine the validity of this claim. 

To quantify the ability of particular values of the selection
parameter to facilitate and maintain cooperation more precisely,
we first calculate pc in dependence on the cost-to-benefit
ratio r for different values of w. Results presented
in the top panel of Fig. 2 clearly attest to the fact that positive
values of w promote the evolution of cooperation, while
negative values of w impede it. Note that the
critical cost-to-benefit ratio r = r_c, marking the extinction of
cooperators, increases by a full order of magnitude at w = 4.0
(orange stars) if compared to the w = 0 (black squares) case.
Interestingly, the promotive effect on the survivability of
cooperators strengthens monotonically with increasing
w, thus suggesting that a universally applicable mechanism is
underlying the observed behavior. Indeed, the monotonic
increase of r_c for increasing w is obvious from the bottom
panel of Fig. 2, showing concisely the extent to which aspiring
to the fittest promotes the evolution of cooperation on the

FIG. 3: (color online) Frequency of cooperators pc in dependence on
the cost-to-benefit ratio r for different values of the selection
parameter w for the random regular graph (RRG) and the scale-free (SF)
network. From left to right w = -0.2, 0, 1.0 for the random regular
graph, and w = -0.2, 0, 1.0 for the scale-free network, respectively.
Note that these results are in qualitative agreement with those
obtained on the square lattice in that negative values of w impair the
evolution of cooperation, while w > 0 moves the survivability of
cooperators towards larger values of r. Depicted results were obtained
for K = 0.1.



square lattice. 

Importantly, qualitatively identical results can be obtained 
on interaction networks other than the square lattice. Results 
presented in Fig. 3 depict how cooperators fare on the random
regular graph and the scale-free network for different values
of w. Similarly as in Fig. 2, it can be observed that positive
values of w promote the evolution of cooperation. Conversely,
negative values of w promote the evolution of defection. This
is in agreement with the observations made on the square
lattice, thus designating w > 0 as being universally effective in
promoting the evolution of cooperation, in particular, working
on regular lattices and graphs as well as highly heterogeneous
networks. Since the latter have been identified as potent
promoters of cooperation in their own right [37], this conclusion
is all the more inspiring.
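
For readers who wish to experiment with heterogeneous topologies, a scale-free network with an average degree close to four can be grown by preferential attachment with two links per new node. The sketch below is a generic textbook-style implementation of Barabási-Albert growth in plain Python, with a fully connected seed of m + 1 nodes as our own assumption; it is not claimed to reproduce the exact generator used for Fig. 3.

```python
import random


def barabasi_albert(n, m, seed=0):
    """Grow an n-node network by preferential attachment; returns adjacency sets."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Seed: m + 1 fully connected nodes (an assumption of this sketch).
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
    # Each node appears in `stubs` once per link end, so a uniform draw
    # from this list is a degree-proportional draw over nodes.
    stubs = [i for i in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # m distinct, degree-biased targets
            targets.add(rng.choice(stubs))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs.extend([new, t])
    return adj


def mean_degree(adj):
    """Average number of neighbors per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)
```

With m = 2 the mean degree approaches four as n grows. Since Eq. (1) sums over whatever neighbors a player has, the selection rule carries over unchanged to nodes of arbitrary degree.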

In order to explain the promotive impact of positive values 
of w on the evolution of cooperation, we examine time courses 
of pc for different values of the selection parameter. Figure 4
features results obtained for r = 0.03 and K = 0.1, whereat
cooperators die out if w = 0 (black line; see also Fig. 2). For
positive values of w, on the other hand, the stationary state is a
mixed C+D phase with cooperators occupying the larger portion
of the square lattice. Interestingly, however, in the
earliest stages of the evolutionary process (note that values of pc
were recorded also in-between full iteration steps) it appears 
as if defectors would actually fare better for w > 0. In fact, 
the larger the value of w, the deeper the initial downfall of co- 
operators. This is actually what one would expect, given that 
defectors are, as individuals, more successful than coopera- 
tors and will thus be chosen more likely as potential strategy 
donors if w is positive. This in turn amplifies their chances of 



FIG. 4: (color online) Time courses depicting the evolution of
cooperation for w = 0 (solid black line), w = 1.0 (dashed gray line),
w = 2.0 (dotted blue line) and w = 4.0 (dash-dotted orange line).
Note that while for w = 0 cooperators die out, for w > 0 they
recover from what appears to become an even faster extinction to
eventually rise to near-dominance. Notably, the stronger the initial
temporary downfall, the better the recovery (see also the inset). All
time courses were obtained as averages over 20 independent
realizations for r = 0.03 and K = 0.1 on a 400 x 400 square lattice.
Note that the horizontal axis is logarithmic and that values of pc
were recorded also in-between full iteration steps to ensure a proper
resolution.



spreading and results in the decimation of cooperators (only 
slightly more than 20% survive). Quite surprisingly though,
the tide changes fast, and as one can observe from the presented
time courses, the more so the deeper the initial downfall
of cooperators. For w = 4.0 we can observe, instead of
cooperator extinction, their near-dominance with pc hovering
comfortably over 0.8 (orange line). We argue that for positive
values of w a negative feedback effect occurs, which halts
and eventually reverses what appears to be a march of defectors
towards their undisputed dominance. Namely, in the very
early stages of the game defectors are able to plunder very 
efficiently, which quickly results in a state where there are 
hardly any cooperators left to exploit. Consequently, the few 
remaining clusters of cooperators start recovering lost ground 
against weakened defectors. Crucial thereby is the fact that 
the clusters formed by cooperators are impervious to defector 
attacks even at high values of r because of the positive se- 
lection towards the fittest neighbors acting as strategy sources 
(occurring for w > 0). In a sea of cooperators this is practi- 
cally always another cooperator rather than a defector trying 
to penetrate into the cluster. This newly identified mechanism 
ultimately results in widespread cooperation that goes beyond 
what can be warranted by the spatial reciprocity alone (see
e.g. [35]), and this irrespective of the underlying interaction
network. As such, aspiration to the fittest, i.e. the propensity 
of designating the most successful neighbor as being the role 
model, may be seen as a universally applicable promoter of 
cooperation. 
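
The claim that a cooperator inside or at the edge of a cluster almost always picks another cooperator as its role model can be illustrated with a small worked example. We consider, as an idealized configuration of our own choosing, a cooperator x sitting on a straight interface between a cooperator domain and a defector domain on the square lattice, and evaluate Eq. (1) for its four neighbors. The labels and function names are hypothetical.

```python
import math


def donor_probabilities(payoffs, w):
    """Eq. (1) evaluated for a dict mapping neighbor label -> payoff."""
    shift = max(w * p for p in payoffs.values())
    weights = {name: math.exp(w * p - shift) for name, p in payoffs.items()}
    total = sum(weights.values())
    return {name: v / total for name, v in weights.items()}


def boundary_example(r, w):
    """Neighbors of a cooperator x on a straight C/D interface (rescaled payoffs)."""
    payoffs = {
        "C interior": 4 * 1.0,               # four cooperating neighbors
        "C boundary (left)": 3 * 1.0 - r,    # three C neighbors, one D neighbor
        "C boundary (right)": 3 * 1.0 - r,
        "D boundary": 1.0 + r,               # exploits x only; other neighbors are D
    }
    return donor_probabilities(payoffs, w)


if __name__ == "__main__":
    for w in (0.0, 1.0, 4.0):
        probs = boundary_example(r=0.03, w=w)
        print(w, {k: round(v, 6) for k, v in probs.items()})
```

For w = 0 the defector is selected with probability 1/4, whereas for w = 4 and r = 0.03 this probability drops below 10^-5 because the interior cooperator has by far the highest payoff. The example covers only the selection stage for one configuration; the subsequent adoption step further depends on the payoff difference and on K.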

FIG. 5: Full r-K phase diagram for w = 0 (top panel) and w = 2.0
(bottom panel), obtained via systematic simulations of the prisoner's
dilemma game on the square lattice. Dashed blue and dotted red lines
mark the border between stationary pure C and D phases and the
mixed C+D phase, respectively. In agreement with previous works
[38, 57], it can be observed that for w = 0 (top panel) there exists
an intermediate uncertainty in the strategy adoption process (an
intermediate value of K) for which the survivability of cooperators is
optimal, i.e. r_c is maximal. Conversely, while the borderline
separating the pure C and the mixed C+D phase for the w = 2.0 case
(bottom panel) exhibits a qualitatively identical outline as for the
w = 0 case, the D ↔ C+D transition is qualitatively different. Note that in
the bottom panel there exists an intermediate value of K for which r_c
is minimal rather than maximal, while towards the large K limit r_c
increases, saturating only for K > 4 (not shown).

Lastly, it is instructive to examine the evolution of cooperation
for w > 0 in dependence on the uncertainty by strategy
adoptions. The latter can be tuned via K, which acts as a 
temperature parameter in the employed Fermi strategy adoption
function [50]. Accordingly, when K → ∞ all information
is lost and the strategies are adopted by means of a coin
toss. The phase diagram presented in the top panel of Fig. 5 is
well-known, implying the existence of an optimal level of
uncertainty for the evolution of cooperation, as was previously
reported in [38, 57]. In particular, note that the D ↔ C+D
transition line is bell shaped, indicating that K ≈ 0.37 is the
optimal temperature at which cooperators are able to survive at the
highest value of r. This phenomenon can be interpreted as an
evolutionary resonance [59], although it can only be observed on
interaction topologies lacking overlapping triangles [57, 60].
Interestingly, positive values of w eradicate (as do interaction
networks incorporating overlapping triangles) the existence of an
optimal K, as can be observed from the phase diagram presented
in the bottom panel of Fig. 5. The latter was obtained for



w = 2.0 and exhibits an inverted bell-shaped D ↔ C+D transition
line, indicating the existence of the worst rather than an
optimal temperature K for the evolution of cooperation. This 
in turn implies that introducing a preference towards the fittest 
neighbors effectively alters the interaction network. While the 
square lattice obviously lacks overlapping triangles and thus 
enables the observation of an optimal K, trimming the likeli- 
hood of who will act as a strategy source seems to effectively 
enhance linkage among essentially disconnected triplets and 
thus precludes the same observation. A similar phenomenon 
was observed recently in public goods games, where the joint 
membership in large groups was also found to alter the effective
interaction network and thus the impact of uncertainty on
the evolution of cooperation [60].
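
The role of K as a temperature can be made explicit by taking the two limits of the adoption probability. The short derivation below assumes the standard Fermi form of the strategy adoption function, with W denoting the probability that player x adopts the strategy of player y.

```latex
W(s_y \to s_x) = \frac{1}{1 + \exp[(p_x - p_y)/K]}

\begin{align}
\lim_{K \to \infty} W &= \frac{1}{1 + e^{0}} = \frac{1}{2}, \\
\lim_{K \to 0^{+}} W &=
  \begin{cases}
    1,   & p_y > p_x, \\
    1/2, & p_y = p_x, \\
    0,   & p_y < p_x.
  \end{cases}
\end{align}
```

In the first limit the payoff difference no longer matters, which is the coin toss mentioned above. In the second limit the better-performing strategy is always adopted, so that K interpolates between fully deterministic and fully random imitation.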



IV. SUMMARY 

In sum, we have shown that aspiring to the fittest promotes 
the evolution of cooperation in the prisoner's dilemma game 
irrespective of the underlying interaction network and the
uncertainty by strategy adoptions. The essence of the identified
mechanism for the cooperation promotion has been attributed 
to a negative feedback effect, occurring because of the for- 
mation of extremely robust clusters (or groups on complex 
networks) of cooperators that are impervious to defector at- 
tacks even at high temptations to defect. Although initially 
the defectors appear to be heading to an undisputed victory, 
the fast exploitation and the consequent shortage of cooperators
weaken the defectors and make them susceptible to being
overtaken by the few remaining cooperators. Also of interest
is the fact that the introduction of a selection parameter,
making the fittest neighbors more likely to act as sources of 
adopted strategies, effectively alters the interaction network. 
While in its absence there exists an intermediate uncertainty K
governing the process of strategy adoptions at which the
largest cost-to-benefit ratio r still warrants the survival of at
least some cooperators, in its presence this feature vanishes 
and becomes qualitatively identical to what was observed pre- 
viously on lattices that do incorporate overlapping triangles, 
such as the kagome lattice [60]. Since in fact the actual
interaction topology remains unaffected by the different values
of the selection parameter w, we have argued that the differ- 
ences in the evolution of cooperation are due to an effective 
transition of the interaction topology, which is brought about 
by the fact that some players are more likely to act as strategy
sources than others. Therefore, the bonds between certain
player pairs appear stronger than average, although the
interaction networks consist of links that are not weighted.

Since aspiring to the fittest, i.e. the propensity of designat- 
ing the most successful neighbors as role models, appears to 
be both widely applicable as well as realistically justifiable, we
hope it will inspire future studies, especially in terms of un- 
derstanding the emergence of successful leaders in societies 
via a coevolutionary process [52]. An interesting interpreta- 
tion of the selection parameter w can also be obtained if the 
latter is considered as a measure of cognitive complexity of 
each individual. In particular, it is possible to argue that the 



more obtuse an individual is, the closer to random his choice 
of a role model will be. If individuals are to be able to aspire to 
the fittest, they should have some degree of information pro- 
cessing capabilities. On the other hand, negative values of w 
can be interpreted as a choice that is based on moral values 
[61], for example, when highly successful individuals are so
by unethical actions and thus should not be imitated.